Bias and Statistical Significance in Evaluating Speech Synthesis with Mean Opinion Scores
Abstract
Listening tests and Mean Opinion Scores (MOS) are the most commonly used techniques for evaluating the quality and naturalness of speech synthesis. They are invaluable for assessing subjective qualities of machine-generated stimuli. However, there are a number of challenges in interpreting the MOS values that come out of listening tests. Primarily, we advocate for the use of non-parametric statistical tests when calculating statistical significance in comparisons of listening test results. Additionally, based on the results of 46 legacy listening tests, we measure the impact of two sources of bias. Bias introduced by individual participants and by the synthesized text can have a dramatic impact on observed MOS values. For example, we find that on average the mean difference between the highest and lowest scoring rater is over 2 MOS points (on a 5-point scale). From this observation, we caution against using any statistical test without adjusting for this bias, and we provide specific non-parametric recommendations.
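To make the idea of a bias-adjusted, non-parametric comparison concrete, the following is a minimal sketch, not the authors' procedure. The abstract does not name the specific tests it recommends, so a paired sign-flip permutation test is used here as one illustrative option. The function name `paired_sign_flip_test`, the rater labels, and all ratings are hypothetical. Comparing each rater only against themselves removes a rater's overall harshness or leniency from the comparison.

```python
import random


def _mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def paired_sign_flip_test(scores_a, scores_b, n_resamples=10000, seed=0):
    """Two-sided paired sign-flip permutation test (illustrative assumption).

    scores_a, scores_b: dict mapping rater id -> list of 1..5 ratings that
    the rater gave to system A and system B respectively.
    Returns (mean per-rater difference A - B, Monte Carlo p-value).
    """
    # Use only raters who scored both systems, so each rater acts as
    # their own control and a constant per-rater offset cancels out.
    raters = sorted(set(scores_a) & set(scores_b))
    diffs = [_mean(scores_a[r]) - _mean(scores_b[r]) for r in raters]
    observed = abs(_mean(diffs))

    rng = random.Random(seed)
    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the A/B labels are exchangeable within
        # a rater, so each difference is equally likely to have either sign.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(_mean(flipped)) >= observed - 1e-12:
            hits += 1

    # The +1 correction keeps the estimated p-value strictly above zero.
    return _mean(diffs), (hits + 1) / (n_resamples + 1)


# Hypothetical ratings: the raters differ widely in overall level, but each
# one rates system A slightly higher than system B.
ratings_a = {"harsh": [2, 2, 3], "neutral": [3, 4, 4], "lenient": [5, 4, 5]}
ratings_b = {"harsh": [1, 2, 2], "neutral": [3, 3, 3], "lenient": [4, 4, 4]}

mean_diff, p_value = paired_sign_flip_test(ratings_a, ratings_b)
print(f"mean per-rater difference: {mean_diff:.3f}, p-value: {p_value:.3f}")
```

With only three raters the test has very little power, so a real listening test would need many more participants; the example is kept small only to show the mechanics.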